1. About this Competition

This section describes the data files, and the diagram explains the relationships between them. The descriptions were copied directly from the Kaggle website: https://www.kaggle.com/c/home-credit-default-risk/data

application_{train|test}.csv

This is the main table, broken into two files for Train (with TARGET) and Test (without TARGET). Static data for all applications. One row represents one loan in our data sample.

bureau.csv

All client’s previous credits provided by other financial institutions that were reported to Credit Bureau (for clients who have a loan in our sample). For every loan in our sample, there are as many rows as number of credits the client had in Credit Bureau before the application date.

bureau_balance.csv

Monthly balances of previous credits in Credit Bureau. This table has one row for each month of history of every previous credit reported to Credit Bureau - i.e. the table has (#loans in sample * # of relative previous credits * # of months where we have some history observable for the previous credits) rows.

POS_CASH_balance.csv

Monthly balance snapshots of previous POS (point of sales) and cash loans that the applicant had with Home Credit. This table has one row for each month of history of every previous credit in Home Credit (consumer credit and cash loans) related to loans in our sample - i.e. the table has (#loans in sample * # of relative previous credits * # of months in which we have some history observable for the previous credits) rows.

credit_card_balance.csv

Monthly balance snapshots of previous credit cards that the applicant has with Home Credit. This table has one row for each month of history of every previous credit in Home Credit (consumer credit and cash loans) related to loans in our sample - i.e. the table has (#loans in sample * # of relative previous credit cards * # of months where we have some history observable for the previous credit card) rows.

previous_application.csv

All previous applications for Home Credit loans of clients who have loans in our sample. There is one row for each previous application related to loans in our data sample.

installments_payments.csv

Repayment history for the previously disbursed credits in Home Credit related to the loans in our sample. There is a) one row for every payment that was made plus b) one row each for missed payment. One row is equivalent to one payment of one installment OR one installment corresponding to one payment of one previous Home Credit credit related to loans in our sample.

HomeCredit_columns_description.csv

This file contains descriptions for the columns in the various data files.

[Figure: data relationship diagram]

2. EDA train and test data

Reading the train and test data

trainLocation <- "data/application_train.csv"
testLocation <- "data/application_test.csv"
library(tidyverse)
trainData <- read_csv(trainLocation)
testData <- read_csv(testLocation)
trainData
testData

There are 307511 observations in the train data set and 48744 observations in the test data set, so about 14% of the data is test data and 86% is train data. Both data sets have 121 features, and the train data has one extra column, the target variable. Most features are numeric (type integer or double); the rest are categorical (type character). There are also a lot of NA values in the data.
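
The split percentages follow directly from the two row counts. A quick check, using only the counts quoted above (the variable names are made up for illustration):

```r
# Row counts quoted in the text above
nTrain <- 307511
nTest  <- 48744

# Share of all applications that falls in each file
testShare  <- nTest / (nTrain + nTest)
trainShare <- 1 - testShare

# Percentages, rounded to one decimal place
round(100 * c(train = trainShare, test = testShare), 1)
```

With the data loaded, nrow(trainData) and nrow(testData) would take the place of the hard-coded counts.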

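Check whether there are any constant features in the train data so we can remove them

library(dplyr)
# count the distinct values in every column (funs() is deprecated in current dplyr)
trainData %>% summarise(across(everything(), n_distinct))

It seems every feature has at least 2 distinct values, so there are no constant features.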
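
To turn that visual check into a list of column names, something like the following could be used. It is a base R sketch on a made-up data frame (toy, its columns, and constantCols are all hypothetical), not the notebook's own code:

```r
# Toy data frame standing in for trainData (hypothetical values)
toy <- data.frame(
  id    = 1:4,
  flag  = c(1, 1, 1, 1),          # constant column
  group = c("a", "b", "a", NA),
  stringsAsFactors = FALSE
)

# Number of distinct values per column; NA counts as a value here,
# which matches the default behaviour of dplyr::n_distinct()
nDistinct <- vapply(toy, function(col) length(unique(col)), integer(1))

# Names of the columns that hold a single distinct value
constantCols <- names(nDistinct)[nDistinct < 2]
constantCols
```

An empty result on the real data would agree with the observation above.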

Process numeric features

library(purrr)
library(tidyr)
library(ggplot2)
library(dplyr)
library(tidyimpute)
# take the absolute value of every numeric feature
numSet1 <- trainData %>% keep(is.numeric) %>% abs
# replace each missing value with the median of its feature
numSetNoNa <- numSet1 %>% impute_median()
numSetNoNa
summary(numSetNoNa)
   SK_ID_CURR         TARGET         CNT_CHILDREN     AMT_INCOME_TOTAL      AMT_CREDIT       AMT_ANNUITY     AMT_GOODS_PRICE   REGION_POPULATION_RELATIVE   DAYS_BIRTH    DAYS_EMPLOYED    DAYS_REGISTRATION DAYS_ID_PUBLISH
 Min.   :100002   Min.   :0.00000   Min.   : 0.0000   Min.   :    25650   Min.   :  45000   Min.   :  1616   Min.   :  40500   Min.   :0.00029            Min.   : 7489   Min.   :     0   Min.   :    0     Min.   :   0   
 1st Qu.:189146   1st Qu.:0.00000   1st Qu.: 0.0000   1st Qu.:   112500   1st Qu.: 270000   1st Qu.: 16524   1st Qu.: 238500   1st Qu.:0.01001            1st Qu.:12413   1st Qu.:   933   1st Qu.: 2010     1st Qu.:1720   
 Median :278202   Median :0.00000   Median : 0.0000   Median :   147150   Median : 513531   Median : 24903   Median : 450000   Median :0.01885            Median :15750   Median :  2219   Median : 4504     Median :3254   
 Mean   :278181   Mean   :0.08073   Mean   : 0.4171   Mean   :   168798   Mean   : 599026   Mean   : 27108   Mean   : 538316   Mean   :0.02087            Mean   :16037   Mean   : 67725   Mean   : 4986     Mean   :2994   
 3rd Qu.:367143   3rd Qu.:0.00000   3rd Qu.: 1.0000   3rd Qu.:   202500   3rd Qu.: 808650   3rd Qu.: 34596   3rd Qu.: 679500   3rd Qu.:0.02866            3rd Qu.:19682   3rd Qu.:  5707   3rd Qu.: 7480     3rd Qu.:4299   
 Max.   :456255   Max.   :1.00000   Max.   :19.0000   Max.   :117000000   Max.   :4050000   Max.   :258026   Max.   :4050000   Max.   :0.07251            Max.   :25229   Max.   :365243   Max.   :24672     Max.   :7197   
  OWN_CAR_AGE      FLAG_MOBIL FLAG_EMP_PHONE   FLAG_WORK_PHONE  FLAG_CONT_MOBILE   FLAG_PHONE       FLAG_EMAIL      CNT_FAM_MEMBERS  REGION_RATING_CLIENT REGION_RATING_CLIENT_W_CITY HOUR_APPR_PROCESS_START
 Min.   : 0.00   Min.   :0    Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   : 1.000   Min.   :1.000        Min.   :1.000               Min.   : 0.00          
 1st Qu.: 9.00   1st Qu.:1    1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.: 2.000   1st Qu.:2.000        1st Qu.:2.000               1st Qu.:10.00          
 Median : 9.00   Median :1    Median :1.0000   Median :0.0000   Median :1.0000   Median :0.0000   Median :0.00000   Median : 2.000   Median :2.000        Median :2.000               Median :12.00          
 Mean   :10.04   Mean   :1    Mean   :0.8199   Mean   :0.1994   Mean   :0.9981   Mean   :0.2811   Mean   :0.05672   Mean   : 2.153   Mean   :2.052        Mean   :2.032               Mean   :12.06          
 3rd Qu.: 9.00   3rd Qu.:1    3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.: 3.000   3rd Qu.:2.000        3rd Qu.:2.000               3rd Qu.:14.00          
 Max.   :91.00   Max.   :1    Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :20.000   Max.   :3.000        Max.   :3.000               Max.   :23.00          
 REG_REGION_NOT_LIVE_REGION REG_REGION_NOT_WORK_REGION LIVE_REGION_NOT_WORK_REGION REG_CITY_NOT_LIVE_CITY REG_CITY_NOT_WORK_CITY LIVE_CITY_NOT_WORK_CITY  EXT_SOURCE_1      EXT_SOURCE_2        EXT_SOURCE_3      
 Min.   :0.00000            Min.   :0.00000            Min.   :0.00000             Min.   :0.00000        Min.   :0.0000         Min.   :0.0000          Min.   :0.01457   Min.   :0.0000001   Min.   :0.0005273  
 1st Qu.:0.00000            1st Qu.:0.00000            1st Qu.:0.00000             1st Qu.:0.00000        1st Qu.:0.0000         1st Qu.:0.0000          1st Qu.:0.50600   1st Qu.:0.3929737   1st Qu.:0.4170997  
 Median :0.00000            Median :0.00000            Median :0.00000             Median :0.00000        Median :0.0000         Median :0.0000          Median :0.50600   Median :0.5659614   Median :0.5352763  
 Mean   :0.01514            Mean   :0.05077            Mean   :0.04066             Mean   :0.07817        Mean   :0.2305         Mean   :0.1796          Mean   :0.50431   Mean   :0.5145034   Mean   :0.5156949  
 3rd Qu.:0.00000            3rd Qu.:0.00000            3rd Qu.:0.00000             3rd Qu.:0.00000        3rd Qu.:0.0000         3rd Qu.:0.0000          3rd Qu.:0.50600   3rd Qu.:0.6634218   3rd Qu.:0.6363762  
 Max.   :1.00000            Max.   :1.00000            Max.   :1.00000             Max.   :1.00000        Max.   :1.0000         Max.   :1.0000          Max.   :0.96269   Max.   :0.8549997   Max.   :0.8960095  
 APARTMENTS_AVG   BASEMENTAREA_AVG  YEARS_BEGINEXPLUATATION_AVG YEARS_BUILD_AVG  COMMONAREA_AVG    ELEVATORS_AVG     ENTRANCES_AVG    FLOORSMAX_AVG    FLOORSMIN_AVG     LANDAREA_AVG     LIVINGAPARTMENTS_AVG
 Min.   :0.0000   Min.   :0.00000   Min.   :0.0000              Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000     
 1st Qu.:0.0876   1st Qu.:0.07630   1st Qu.:0.9816              1st Qu.:0.7552   1st Qu.:0.02110   1st Qu.:0.00000   1st Qu.:0.1379   1st Qu.:0.1667   1st Qu.:0.2083   1st Qu.:0.04810   1st Qu.:0.07560     
 Median :0.0876   Median :0.07630   Median :0.9816              Median :0.7552   Median :0.02110   Median :0.00000   Median :0.1379   Median :0.1667   Median :0.2083   Median :0.04810   Median :0.07560     
 Mean   :0.1023   Mean   :0.08134   Mean   :0.9796              Mean   :0.7543   Mean   :0.02819   Mean   :0.03687   Mean   :0.1438   Mean   :0.1966   Mean   :0.2159   Mean   :0.05551   Mean   :0.08357     
 3rd Qu.:0.0876   3rd Qu.:0.07630   3rd Qu.:0.9821              3rd Qu.:0.7552   3rd Qu.:0.02110   3rd Qu.:0.00000   3rd Qu.:0.1379   3rd Qu.:0.1667   3rd Qu.:0.2083   3rd Qu.:0.04810   3rd Qu.:0.07560     
 Max.   :1.0000   Max.   :1.00000   Max.   :1.0000              Max.   :1.0000   Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000     
 LIVINGAREA_AVG    NONLIVINGAPARTMENTS_AVG NONLIVINGAREA_AVG APARTMENTS_MODE   BASEMENTAREA_MODE YEARS_BEGINEXPLUATATION_MODE YEARS_BUILD_MODE COMMONAREA_MODE  ELEVATORS_MODE    ENTRANCES_MODE   FLOORSMAX_MODE  
 Min.   :0.00000   Min.   :0.000000        Min.   :0.0000    Min.   :0.00000   Min.   :0.00000   Min.   :0.0000               Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.07450   1st Qu.:0.000000        1st Qu.:0.0036    1st Qu.:0.08400   1st Qu.:0.07460   1st Qu.:0.9811               1st Qu.:0.7648   1st Qu.:0.0190   1st Qu.:0.00000   1st Qu.:0.1379   1st Qu.:0.1667  
 Median :0.07450   Median :0.000000        Median :0.0036    Median :0.08400   Median :0.07460   Median :0.9816               Median :0.7648   Median :0.0190   Median :0.00000   Median :0.1379   Median :0.1667  
 Mean   :0.09089   Mean   :0.002693        Mean   :0.0147    Mean   :0.09889   Mean   :0.07997   Mean   :0.9793               Mean   :0.7631   Mean   :0.0261   Mean   :0.03479   Mean   :0.1415   Mean   :0.1946  
 3rd Qu.:0.07450   3rd Qu.:0.000000        3rd Qu.:0.0036    3rd Qu.:0.08400   3rd Qu.:0.07460   3rd Qu.:0.9816               3rd Qu.:0.7648   3rd Qu.:0.0190   3rd Qu.:0.00000   3rd Qu.:0.1379   3rd Qu.:0.1667  
 Max.   :1.00000   Max.   :1.000000        Max.   :1.0000    Max.   :1.00000   Max.   :1.00000   Max.   :1.0000               Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000  
 FLOORSMIN_MODE   LANDAREA_MODE     LIVINGAPARTMENTS_MODE LIVINGAREA_MODE   NONLIVINGAPARTMENTS_MODE NONLIVINGAREA_MODE APARTMENTS_MEDI  BASEMENTAREA_MEDI YEARS_BEGINEXPLUATATION_MEDI YEARS_BUILD_MEDI COMMONAREA_MEDI  
 Min.   :0.0000   Min.   :0.00000   Min.   :0.00000       Min.   :0.00000   Min.   :0.000000         Min.   :0.00000    Min.   :0.0000   Min.   :0.00000   Min.   :0.0000               Min.   :0.0000   Min.   :0.00000  
 1st Qu.:0.2083   1st Qu.:0.04580   1st Qu.:0.07710       1st Qu.:0.07310   1st Qu.:0.000000         1st Qu.:0.00110    1st Qu.:0.0864   1st Qu.:0.07580   1st Qu.:0.9816               1st Qu.:0.7585   1st Qu.:0.02080  
 Median :0.2083   Median :0.04580   Median :0.07710       Median :0.07310   Median :0.000000         Median :0.00110    Median :0.0864   Median :0.07580   Median :0.9816               Median :0.7585   Median :0.02080  
 Mean   :0.2147   Mean   :0.05358   Mean   :0.08613       Mean   :0.08947   Mean   :0.002469         Mean   :0.01272    Mean   :0.1019   Mean   :0.08084   Mean   :0.9796               Mean   :0.7576   Mean   :0.02797  
 3rd Qu.:0.2083   3rd Qu.:0.04580   3rd Qu.:0.07710       3rd Qu.:0.07310   3rd Qu.:0.000000         3rd Qu.:0.00110    3rd Qu.:0.0864   3rd Qu.:0.07580   3rd Qu.:0.9821               3rd Qu.:0.7585   3rd Qu.:0.02080  
 Max.   :1.0000   Max.   :1.00000   Max.   :1.00000       Max.   :1.00000   Max.   :1.000000         Max.   :1.00000    Max.   :1.0000   Max.   :1.00000   Max.   :1.0000               Max.   :1.0000   Max.   :1.00000  
 ELEVATORS_MEDI    ENTRANCES_MEDI   FLOORSMAX_MEDI   FLOORSMIN_MEDI   LANDAREA_MEDI    LIVINGAPARTMENTS_MEDI LIVINGAREA_MEDI   NONLIVINGAPARTMENTS_MEDI NONLIVINGAREA_MEDI TOTALAREA_MODE    OBS_30_CNT_SOCIAL_CIRCLE
 Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000       Min.   :0.00000   Min.   :0.000000         Min.   :0.00000    Min.   :0.00000   Min.   :  0.000         
 1st Qu.:0.00000   1st Qu.:0.1379   1st Qu.:0.1667   1st Qu.:0.2083   1st Qu.:0.0487   1st Qu.:0.07610       1st Qu.:0.07490   1st Qu.:0.000000         1st Qu.:0.00310    1st Qu.:0.06700   1st Qu.:  0.000         
 Median :0.00000   Median :0.1379   Median :0.1667   Median :0.2083   Median :0.0487   Median :0.07610       Median :0.07490   Median :0.000000         Median :0.00310    Median :0.06880   Median :  0.000         
 Mean   :0.03647   Mean   :0.1435   Mean   :0.1964   Mean   :0.2158   Mean   :0.0562   Mean   :0.08428       Mean   :0.09169   Mean   :0.002644         Mean   :0.01437    Mean   :0.08626   Mean   :  1.417         
 3rd Qu.:0.00000   3rd Qu.:0.1379   3rd Qu.:0.1667   3rd Qu.:0.2083   3rd Qu.:0.0487   3rd Qu.:0.07610       3rd Qu.:0.07490   3rd Qu.:0.000000         3rd Qu.:0.00310    3rd Qu.:0.07030   3rd Qu.:  2.000         
 Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000       Max.   :1.00000   Max.   :1.000000         Max.   :1.00000    Max.   :1.00000   Max.   :348.000         
 DEF_30_CNT_SOCIAL_CIRCLE OBS_60_CNT_SOCIAL_CIRCLE DEF_60_CNT_SOCIAL_CIRCLE DAYS_LAST_PHONE_CHANGE FLAG_DOCUMENT_2    FLAG_DOCUMENT_3 FLAG_DOCUMENT_4    FLAG_DOCUMENT_5   FLAG_DOCUMENT_6   FLAG_DOCUMENT_7    
 Min.   : 0.0000          Min.   :  0.000          Min.   : 0.00000         Min.   :   0.0         Min.   :0.00e+00   Min.   :0.00    Min.   :0.00e+00   Min.   :0.00000   Min.   :0.00000   Min.   :0.0000000  
 1st Qu.: 0.0000          1st Qu.:  0.000          1st Qu.: 0.00000         1st Qu.: 274.0         1st Qu.:0.00e+00   1st Qu.:0.00    1st Qu.:0.00e+00   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000000  
 Median : 0.0000          Median :  0.000          Median : 0.00000         Median : 757.0         Median :0.00e+00   Median :1.00    Median :0.00e+00   Median :0.00000   Median :0.00000   Median :0.0000000  
 Mean   : 0.1429          Mean   :  1.401          Mean   : 0.09972         Mean   : 962.9         Mean   :4.23e-05   Mean   :0.71    Mean   :8.13e-05   Mean   :0.01511   Mean   :0.08806   Mean   :0.0001919  
 3rd Qu.: 0.0000          3rd Qu.:  2.000          3rd Qu.: 0.00000         3rd Qu.:1570.0         3rd Qu.:0.00e+00   3rd Qu.:1.00    3rd Qu.:0.00e+00   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000000  
 Max.   :34.0000          Max.   :344.000          Max.   :24.00000         Max.   :4292.0         Max.   :1.00e+00   Max.   :1.00    Max.   :1.00e+00   Max.   :1.00000   Max.   :1.00000   Max.   :1.0000000  
 FLAG_DOCUMENT_8   FLAG_DOCUMENT_9    FLAG_DOCUMENT_10   FLAG_DOCUMENT_11   FLAG_DOCUMENT_12  FLAG_DOCUMENT_13   FLAG_DOCUMENT_14   FLAG_DOCUMENT_15  FLAG_DOCUMENT_16   FLAG_DOCUMENT_17    FLAG_DOCUMENT_18 
 Min.   :0.00000   Min.   :0.000000   Min.   :0.00e+00   Min.   :0.000000   Min.   :0.0e+00   Min.   :0.000000   Min.   :0.000000   Min.   :0.00000   Min.   :0.000000   Min.   :0.0000000   Min.   :0.00000  
 1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.00e+00   1st Qu.:0.000000   1st Qu.:0.0e+00   1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0.00000  
 Median :0.00000   Median :0.000000   Median :0.00e+00   Median :0.000000   Median :0.0e+00   Median :0.000000   Median :0.000000   Median :0.00000   Median :0.000000   Median :0.0000000   Median :0.00000  
 Mean   :0.08138   Mean   :0.003896   Mean   :2.28e-05   Mean   :0.003912   Mean   :6.5e-06   Mean   :0.003525   Mean   :0.002936   Mean   :0.00121   Mean   :0.009928   Mean   :0.0002667   Mean   :0.00813  
 3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.00e+00   3rd Qu.:0.000000   3rd Qu.:0.0e+00   3rd Qu.:0.000000   3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.0000000   3rd Qu.:0.00000  
 Max.   :1.00000   Max.   :1.000000   Max.   :1.00e+00   Max.   :1.000000   Max.   :1.0e+00   Max.   :1.000000   Max.   :1.000000   Max.   :1.00000   Max.   :1.000000   Max.   :1.0000000   Max.   :1.00000  
 FLAG_DOCUMENT_19    FLAG_DOCUMENT_20    FLAG_DOCUMENT_21    AMT_REQ_CREDIT_BUREAU_HOUR AMT_REQ_CREDIT_BUREAU_DAY AMT_REQ_CREDIT_BUREAU_WEEK AMT_REQ_CREDIT_BUREAU_MON AMT_REQ_CREDIT_BUREAU_QRT AMT_REQ_CREDIT_BUREAU_YEAR
 Min.   :0.0000000   Min.   :0.0000000   Min.   :0.0000000   Min.   :0.000000           Min.   :0.000000          Min.   :0.00000            Min.   : 0.0000           Min.   :  0.0000          Min.   : 0.000            
 1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.000000           1st Qu.:0.000000          1st Qu.:0.00000            1st Qu.: 0.0000           1st Qu.:  0.0000          1st Qu.: 1.000            
 Median :0.0000000   Median :0.0000000   Median :0.0000000   Median :0.000000           Median :0.000000          Median :0.00000            Median : 0.0000           Median :  0.0000          Median : 1.000            
 Mean   :0.0005951   Mean   :0.0005073   Mean   :0.0003349   Mean   :0.005538           Mean   :0.006055          Mean   :0.02972            Mean   : 0.2313           Mean   :  0.2296          Mean   : 1.778            
 3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.000000           3rd Qu.:0.000000          3rd Qu.:0.00000            3rd Qu.: 0.0000           3rd Qu.:  0.0000          3rd Qu.: 3.000            
 Max.   :1.0000000   Max.   :1.0000000   Max.   :1.0000000   Max.   :4.000000           Max.   :9.000000          Max.   :8.00000            Max.   :27.0000           Max.   :261.0000          Max.   :25.000            

numSetNoNa contains 106 features, including the target variable. After imputation it has no missing values.
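
For readers without the tidyimpute package, the same two steps (absolute value, then median imputation) can be sketched in base R. The data frame below is invented for illustration and is much smaller than the real table:

```r
# Toy numeric data with missing values (hypothetical)
toy <- data.frame(
  days   = c(-10, -20, NA, -40),
  amount = c(100, NA, 300, 500)
)

# Same order as above: absolute value first, then fill NA with the column median
toyAbs <- abs(toy)
toyNoNa <- as.data.frame(lapply(toyAbs, function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
}))

toyNoNa
# Count of remaining missing values per column
colSums(is.na(toyNoNa))
```

Running colSums(is.na(numSetNoNa)) on the real data is one way to confirm that no missing values remain.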

Process Categorical features

# get the categorical features
catSet <- trainData %>% keep(is.character)
# use the mode of each feature to replace its missing values
catSetNoNa <- catSet %>% impute_most_freq
# label-encode the categorical features: factor levels become integer codes
catSetFactor <- catSetNoNa %>% mutate_all(as.factor)
catSetLabelEncode <- catSetFactor %>% mutate_all(as.numeric)
catSetLabelEncode
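
What the factor-then-numeric conversion does to a single column can be seen on a small example. The vector below is made up; only the two function calls come from the code above:

```r
# Toy categorical column (hypothetical values)
contractType <- c("Revolving loans", "Cash loans", "Cash loans", "Revolving loans")

# as.factor() sorts the distinct values to build the levels, and
# as.numeric() then returns each value's position in that level list
asFactor <- as.factor(contractType)
levels(asFactor)

codes <- as.numeric(asFactor)
codes
```

The integers only identify the category; their order reflects the sorting of the labels, not any ranking of the categories.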

Plotting

Create two functions for plotting that we use later

plotDistribtuion <- function(data,plot_type){
    data %>%
    gather() %>% 
    ggplot(aes(value)) + 
    facet_wrap(~key,scales = "free") + 
    plot_type()
}
plotDistribtuionCoordFlip <- function(data,plot_type){
    data %>%
    gather() %>% 
    ggplot(aes(value)) + 
    facet_wrap(~key,scales = "free") + 
    plot_type() + coord_flip()
}
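
Both functions rely on gather() to reshape the data into one key column and one value column, so that facet_wrap() can draw one panel per feature. The reshaping step alone can be sketched in base R, with stack() as a stand-in for gather() and a made-up data frame:

```r
# Toy wide data frame (hypothetical values)
wide <- data.frame(a = c(1, 2), b = c(10, 20))

# stack() is a base R stand-in for gather(): one row per (column, value) pair
long <- stack(wide)

# Rename to the column names gather() would produce
names(long) <- c("value", "key")
long
```

Each original column becomes a block of rows sharing the same key, which is what the facets are split on.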

We split the numeric features into smaller chunks for plotting

NumCol_1 <- numSetNoNa %>% select(1:9)
NumCol_2 <- numSetNoNa %>% select(10:19)
NumCol_3 <- numSetNoNa %>% select(20:29)
NumCol_4 <- numSetNoNa %>% select(30:39)
NumCol_5 <- numSetNoNa %>% select(40:49)
NumCol_6 <- numSetNoNa %>% select(50:59)
NumCol_7 <- numSetNoNa %>% select(60:69)
NumCol_8 <- numSetNoNa %>% select(70:79)
NumCol_9 <- numSetNoNa %>% select(80:89)
NumCol_10 <- numSetNoNa %>% select(90:99)
NumCol_11 <- numSetNoNa %>% select(100:106)
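
The same kind of split could also be generated rather than typed out. This is a sketch, assuming chunks of 10 columns; the chunk boundaries differ slightly from the manual ones above (which start with 1:9), and nCols, chunkSize, and colChunks are names invented here:

```r
nCols     <- 106   # number of numeric columns reported above
chunkSize <- 10    # assumed chunk size

# Group the column positions 1..nCols into consecutive blocks of chunkSize
colChunks <- split(seq_len(nCols), ceiling(seq_len(nCols) / chunkSize))

length(colChunks)    # number of chunks
lengths(colChunks)   # number of columns in each chunk
```

Each element of colChunks could then be passed to select(), for example numSetNoNa %>% select(colChunks[[1]]).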

Plot histograms of the numeric features

plotDistribtuion(NumCol_1,geom_histogram)

plotDistribtuion(NumCol_2,geom_histogram)

plotDistribtuion(NumCol_3,geom_histogram)

plotDistribtuion(NumCol_4,geom_histogram)

plotDistribtuion(NumCol_5,geom_histogram)

plotDistribtuion(NumCol_6,geom_histogram)

plotDistribtuion(NumCol_7,geom_histogram)

plotDistribtuion(NumCol_8,geom_histogram)

plotDistribtuion(NumCol_9,geom_histogram)

plotDistribtuion(NumCol_10,geom_histogram)

plotDistribtuion(NumCol_11,geom_histogram)

We have a lot of right-skewed features and binary features.
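
Right skew can also be checked numerically. In the summary above, for example, the mean of AMT_INCOME_TOTAL (168798) is well above its median (147150), which is typical of a long right tail. The sketch below computes a moment-based skewness on a made-up vector; the skewness() helper is defined here and is not part of the notebook:

```r
# Moment-based sample skewness: positive values indicate a longer right tail
skewness <- function(x) {
  x <- x[!is.na(x)]
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

# Toy income-like vector with one very large value (hypothetical)
income <- c(100, 120, 130, 150, 160, 2000)

skewness(income)
mean(income) > median(income)
```

Applied column by column, for instance with sapply(numSetNoNa, skewness), this would give a rough ranking of the most skewed features.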

We split the categorical features into smaller chunks for plotting

mutiTypeCol <- c("NAME_CONTRACT_TYPE","NAME_EDUCATION_TYPE","NAME_FAMILY_STATUS","NAME_HOUSING_TYPE","NAME_INCOME_TYPE","NAME_TYPE_SUITE","OCCUPATION_TYPE","WEEKDAY_APPR_PROCESS_START","FONDKAPREMONT_MODE","WALLSMATERIAL_MODE","HOUSETYPE_MODE")
# all_of() makes it explicit that mutiTypeCol is a vector of column names
catCol1 <- catSetNoNa %>% select(all_of(mutiTypeCol))
catCol2 <- catSetNoNa %>% select(-all_of(mutiTypeCol),-"ORGANIZATION_TYPE")
catCol3 <- catSetNoNa %>% select("ORGANIZATION_TYPE")

Plot bar charts of the categorical features

plotDistribtuionCoordFlip(catCol1,geom_bar)

plotDistribtuion(catCol2,geom_bar)

plotDistribtuionCoordFlip(catCol3,geom_bar)

We pick some features that we think are important and plot the correlations between them

corFeatures <- numSetNoNa %>% select(TARGET,CNT_CHILDREN,AMT_INCOME_TOTAL,AMT_CREDIT,AMT_ANNUITY,AMT_GOODS_PRICE,CNT_FAM_MEMBERS,OBS_30_CNT_SOCIAL_CIRCLE,DEF_30_CNT_SOCIAL_CIRCLE,OBS_60_CNT_SOCIAL_CIRCLE,DEF_60_CNT_SOCIAL_CIRCLE)
library(corrplot)
M <- round(cor(corFeatures),2)
corrplot(M, method = "number")

If two features are highly correlated with each other, we can remove one of them later when building the model.
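
Reading the pairs off the plot works for a handful of features. A sketch of how the highly correlated pairs could be listed from the matrix itself is shown below; the data, the 0.9 threshold, and the corPairs name are assumptions made for illustration:

```r
# Toy data: y is almost a copy of x, z is unrelated (hypothetical values)
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(1.1, 2.0, 3.2, 3.9, 5.1, 6.0),
  z = c(3, 1, 4, 1, 5, 2)
)

M <- cor(toy)

# Keep each pair once by blanking the diagonal and the lower triangle
M[lower.tri(M, diag = TRUE)] <- NA

threshold <- 0.9   # assumed cut-off, not taken from the notebook
hits <- which(abs(M) > threshold, arr.ind = TRUE)

# One row per feature pair whose absolute correlation exceeds the threshold
corPairs <- data.frame(
  feature1 = rownames(M)[hits[, "row"]],
  feature2 = colnames(M)[hits[, "col"]],
  r        = M[hits]
)
corPairs
```

Which feature of a pair to drop, and what threshold to use, remains a modelling decision.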

readyData <- cbind(numSetNoNa,catSetLabelEncode)
numOnly <- readyData %>% keep(is.numeric)
numOnly[266367, 11]
[1] 10116.04

Prepare the test data the same way as the train data

testNumSet <- testData %>% keep(is.numeric) %>% abs
testNumSetNoNa <- testNumSet %>% impute_median() 
testCatSet <- testData %>% keep(is.character)
testCatSetNoNa <- testCatSet %>% impute_most_freq
testCatSetFactor <- testCatSetNoNa %>% mutate_all(as.factor)
testCatSetLabelEncode <- testCatSetFactor %>% mutate_all(as.numeric)
readyTestData <- cbind(testNumSetNoNa,testCatSetLabelEncode)
write.csv(readyTestData, file = "data/testReady.csv",row.names=FALSE)
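
One thing the steps above leave open is whether a category receives the same integer in both files, since train and test are encoded independently. The sketch below shows how the codes can drift when a category is absent from one file, and one possible way to keep them aligned by reusing the train levels. The vectors are made up and this is a suggestion, not what the notebook does:

```r
# Toy train and test columns; "Maternity leave" is absent from test (hypothetical)
trainCol <- c("Working", "Pensioner", "Maternity leave", "Working")
testCol  <- c("Working", "Pensioner", "Working")

# Encoding each file on its own: "Working" gets a different code in each
separateTrain <- as.numeric(as.factor(trainCol))
separateTest  <- as.numeric(as.factor(testCol))

# Encoding test with the train levels keeps the codes aligned
trainLevels <- levels(as.factor(trainCol))
trainCodes  <- as.numeric(factor(trainCol, levels = trainLevels))
testCodes   <- as.numeric(factor(testCol,  levels = trainLevels))

rbind(separateTest, testCodes)
```

A category that appears only in the test file would become NA under this approach and would need separate handling.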